GLMM Maternal Mortality Federally- Summer 2025

This is a Report Template

Author

Carolyn Herrera & Catherine Funte (Advisor: Dr. Cohen)

Published

July 30, 2025

Slides: slides.html ( Go to slides.qmd to edit)

Important

Remember: Your goal is to make your audience understand and care about your findings. By crafting a compelling story, you can effectively communicate the value of your data science project.

Carefully read this template since it has instructions and tips to writing!

Introduction

Generalized Linear Mixed Models (GLMMs) are a flexible class of statistical models that combine the features of two powerful tools: Generalized Linear Models (GLMs) and Mixed-Effects Models(Agresti 2015). Like GLMs, GLMMs can model non-normal outcome variables, such as binary, count, or proportion data. However, they go a step further by incorporating random effects, which account for variation due to grouping or clustering in the data, correlated observations, and overdispersion. The GLMM method can work with complex data structures such as temporal correlation, spatial correlation , nested data, heterogeneity of variance, and repeated measurement(Zuur et al. 2009).

In practical terms, GLMMs are especially useful when data points are not independent, such as when students are nested within schools, patients are treated within hospitals, or repeated measures are taken from the same subject over time. For example, Thall wrote that issues with longitudinal clinical trial basic count data from repeated measures taken from the same subject over time will have problems detecting comparable between subject outcomes because it can be difficult to determine if outcomes are time dependent or due to treatment groups, thus a general linear mixed model method may be utilized to represent dependence upon each patient, incorporate covariate data, create time as a function, account for variability between patients,and be flexible and tractable (Thall 1988). The random effects help model the correlation within clusters and allow for unobserved heterogeneity—differences that are not captured by the measured covariates.

GLMMs are good for:

  • Handling hierarchical or grouped data (e.g., students within classrooms, patients within clinics)(Lee and Nelder 1996)

  • Modeling non-normal outcomes, such as:

  • Improving inference by accounting for both fixed effects (predictors of interest) and random effects (random variation across groups)

  • Reducing bias and inflated Type I error rates that can result from ignoring data structure(Thompson et al. 2022)

GLMMs are ideal when your data is both complex in structure and involves non-Gaussian response variables, making them indispensable in fields like medicine, ecology, education, and social sciences. Tawiah et al describes zero-inflated Poisson GLMMs, an extension of Poisson GLMM that allows for overdispersion due to a prevalence of zeros in the data, common in health sector data(Tawiah, Iddi, and Lotsi 2020). The paper compares a Poisson GLM, a zero-inflated Poisson GLM, a Poisson GLMM, and a zero-inflated Poisson GLMM, applied to clustered maternal mortality data. Another paper by Owili et al utilizes a GLMM to investigate the impact of particulate matter on maternal and infant mortality globally (Owili et al. 2020). They use a Poisson link function and take year and country as random effects to account for differences in global data.

We wish to analyze the data of federal maternal mortality deaths via VSRR Provisional Maternal Death Counts and Rates dataset by utilizing a General linear mixed model with Poisson link as it is count data. We wish to see if ethnicity (a fixed effect) has any influence upon maternal death per live births(number of live births per 12 month period or year) count by year(random effect). Like other public health or clinical data there will be issues such as correlated observations and overdispersion but GLMM will be utilized to parse through the noise and determine if indeed there are some patterns of maternal mortality among mothers of differing ethnic ties.

Methods

Math Background

GLMMs can be considered an extension of GLMs, wherein a GLM includes the addition of random effects, or an extension of Linear Mixed Models (LMMs), where a linear model with fixed and random effects is extended for non-normal distributions(Salinas Ruı́z et al. 2023). Let

  • \(\mathbf{y}\) be a \(Nx1\) column vector outcome variable

  • \(\mathbf{X}\) be a \(Nxp\) matrix for the \(p\) predictor variables

  • \(\boldsymbol{\beta}\) be a \(px1\) column vector of the fixed effects coefficients

  • \(\mathbf{Z}\) is a \(Nxq\) matrix of the \(q\) random effects

  • \(\mathbf{u}\) is a \(qx1\) vector of random effects, and

  • \(\boldsymbol{\epsilon}\) is a \(Nx1\) column vector of the residuals

Then the general equation for the model is given by:

\[\mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\mathbf{Z}{u}+\boldsymbol{\epsilon}\] (Salinas Ruı́z et al. 2023). The GLMM Model process is that the analyis of variance model or the equation is a vector of linear predictors with of unknown parameters estimates. Each distribution has is its own probability function which we will utilize the Negative Binomial as GLMMs typically include a link function that relates the response variable \(\mathbf{y}\) to a linear predictor, \(\eta\), which excludes the residuals. So then \[\boldsymbol{\eta}=\mathbf{X}\boldsymbol{\beta}+\mathbf{Z}\boldsymbol{\lambda}\] The link function is \(g(\cdot)\), where \[g(E(\mathbf{y}))=\boldsymbol{\eta}\] where \(E(\mathbf{y})\) is the expectation of . The choice of link function depends on the outcome distribution. For this paper our data demonstrates a Negative Binomial distribution for overdispered count data, so we will use a log link function.
\[g(\cdot)=log_e(\cdot)\] In the GLMM model the parameter estimates is solved by reducing the negative log likelihood functions(Salinas Ruı́z et al. 2023). The means or the least square means are derivative of the parameter estimates and are found on the model scale. The link function, negative binomial log link, will convert the mean estimates at the model scale to the data scale. Negative Binomial Distribution: \[ f(y;k,{\mu})=\frac{\Gamma(y+k)}{\Gamma(k)*(y+1)}\left(\frac{k}{\mu+k}\right)^{k}\left(1-\frac{k}{\mu+k}\right)^{y} \] Zuur writes that the negative binomial Distribution has two parameters \({\mu}\) and \(k\) (Zuur et al. 2009). The symbol \({\Gamma}\) is defined as \({\Gamma(y+1)=(y+1)!}\) The Mean of Negative Binomial is given: \(E(Y)= {\mu}\) The Variance of Negative binomial is given; \(Var(Y)= {\mu}+ \left(\frac{\mu^2}{k}\right)\), where second term determines the overdispersion, \(k\) is called the dispersion parameter and indirectly determines overdispersion. If \(k\) is significantly large relative to \({\mu^2}\) then the second term will approximate to zero and a Poisson distribution may as well be used. However, the smaller the \(k\) value the larger the overdispersion may form and then negative binomial is the correct log link to utilize.

GLMM model process is

Assumptions

Before deciding to use a GLMM for our data, we had to check some assumptions (specific to our negative binomial distributed data).

  • The response variable and the predictors have a linear relationship within the levels of random effects.
  • The response variable is assumed to follow a negative binomial distribution, with \(\sigma^2>\mu\).
  • The residuals and random effects are independent.
  • The random effects are assumed to be normally distributed, with mean 0 and variance \(\sigma\).

Why Chosen for

The Negative Binomial distribution is ideal for overdispersed count data in a fashion the poisson distribution cannot accommodate due Negative Binomial distribution’s quadratic nature of mean-variance relationship. Overdispersion is a given for populations due to their heterogeneity that aggregation processes such as community clusters form with shared traits such as ethnicity. Our data includes community cluster groups such as ethnicity and age groups. The negative binomial model is desirable also due to its easily intrepretable dispersion parameter as a measure of aggregation and its tractability due to closed form expression of probability mass function which helps in convenient model inference and estimation. Such as our data is longitudinal a Negative Binomial GLMM is ideal since measurements are taken over time. We incoporate Year as a random effect because it could explain for variation in our model which could not be explained by our fixed effects such as ethnicity or age groups.

WE shall analyze by using R(R Core Team 2025) and the packages lme4{Bates et al. (2015)} with function glmer to investigate poisson model or glmer.nb() for negative binomial model and the package glmmTMB{McGillycuddy et al. (2025)} with the function, nbinom2(), for negative binomial model. The approximate parameter estimation method uses for glmmTMB uses AGQ or Laplace approximation with Wald Z significance test. The approximate parameter estimation method for lme4 default is Laplace approximation when no quadrature point is used but is pseudo-likelihood method in linearization,and integral approximation the and another estimation method is Gauss-Hermite quadrature to approximate the log-likelihood using numerical integration(Salinas Ruı́z et al. 2023).

Densities are seen as counts per volume and can be modeled with NB and offset variable(Zuur et al. 2009). We are investigation maternal deaths per live births and while we can use the variable rate, it would be less noisy if we used the offset variable log(Live Births). We can also model without the offset and utilize Akaike information criterion,AIC, to see which is the better model as the smaller the AIC would indicate a better model. A random effect could be month,year, or month/year. Fixed effects could be age groups, ethnicity, and perhaps year. Our choice of random or fixed effects are dependent on the question we which to investigate.

Analysis and Results

Data Exploration and Visualization

The data we used come from the National Vital Statistics System, an organization that provides national data about births, deaths, marriages, divorces, and fetal deaths. It is a collaboration between the National Center for Health Statistics (NCHS) (a part of the CDC) and state vital records offices. {create citation https://www.cdc.gov/nchs/nvss/about_nvss.htm} This dataset is the Vital Statistics Rapid Release (VSRR) Provisional Maternal Death Counts and Rates, in the form of a .csv file. The dataset contains monthly death counts and death rates from 2019 to 2024 by race and ethnicity, age and overall. Rates are maternal deaths per 100,000 live birthd. For this dataset, a maternal death is defined as “the death of a woman while pregnant or within 42 days of termination of pregnancy irrespective of the duration and the site of the pregnancy, from any cause related to or aggravated by the pregnancy or its management, but not from accidental or incidental causes.” {create citation https://www.cdc.gov/nchs/nvss/vsrr/provisional-maternal-deaths-rates.htm}

  • Present initial findings and insights through visualizations.

  • Highlight unexpected patterns or anomalies.

Code
# loading packages 
library(tidyverse)
library(knitr)
library(ggthemes)
library(ggrepel)
library(dslabs)
library(ggplot2)
library(patchwork) # For combining plots
library(ggpubr)
library(DataExplorer) #For EDA
options(repos = c(CRAN = "https://cloud.r-project.org"))#specify cran mirror
if (!require("lme4")) install.packages("lme4")#if required to install
library(lme4)
##to check if model is overdispersed'
if (!require("performance")) install.packages("performance")#if required to install
library(performance)
##Overdispersion is detected with Poisson model thus we must use negative binomial model
if (!require("glmmTMB")) install.packages("glmmTMB")#if required to install
library(glmmTMB)
Code
df <- read.csv("data/VSRR_Provisional_Maternal_Death_Counts_and_Rates_20250726.csv")#loading data
#variables- Maternal.Deaths--outcome variable
#Live.Births---offset variable since it is maternal deaths per live births
#Subgroup- ethnicity and year of mother must be edited --- these are fixed
#Year.of.Death---random effect variable
#Summary
#renaming columns
df<-df%>% 
  rename(Year= Year.of.Death, Month= Month.of.Death, Maternal_Deaths= Maternal.Deaths, Live_Births= Live.Births, Maternal_Mortality_Rate= Maternal.Mortality.Rate, Time_Period= Time.Period, Month_Ending_Date= Month.Ending.Date, Data_As_Of= Data.As.Of) 
  
filtered_df<- df%>% filter(Maternal_Deaths > 0, Live_Births > 100) #Filtering out rare combinations
  
test_df <- na.omit(filtered_df)#filtered na
Code
#Summary Statistics

deaths_df <-test_df %>%
  mutate(
    Ethnicity = case_when(
      Group == "Race and Hispanic origin" ~ Subgroup,
      TRUE ~ NA_character_
    ),
    Age_Group = case_when(
      Group == "Age" ~ Subgroup,
      TRUE ~ NA_character_
    ),
    Is_Total = Subgroup == "Total"
  ,
  Year = as.factor(Year),
  Month= as.factor(Month),
  Ethnicity=as.factor(Ethnicity),
  Age_Group=as.factor(Age_Group))#mutating groups

deaths_df_ethnic<-deaths_df %>%
  filter(!is.na(Ethnicity))%>%
  mutate(Ethnicity = Subgroup)###filter out is total if ethnicity or age group are na

deaths_df_age <-deaths_df %>%
  filter(!is.na(Age_Group)) %>%
  mutate(Age_Group = Subgroup)


merged_deaths_df <-  bind_rows(deaths_df_ethnic, deaths_df_age)  #merge both df

#setting factor levels
deaths_df3 <- merged_deaths_df %>%
  mutate(
    Ethnicity = factor(Ethnicity, levels = c(
      "Asian, Non-Hispanic", "Black, Non-Hispanic", "White, Non-Hispanic",
      "Hispanic", "American Indian or Alaska Native, Non-Hispanic",
      "Native Hawaiian or Other Pacific Islander, Non-Hispanic", "Unknown"
    )),
    Age_Group = factor(Age_Group, levels = c("Under 25 years", "25-39 years", "40 years and over", "Unknown"))
  )

#mutating any NAs with unknown
deaths_df3<- deaths_df3 %>%
   mutate(
    Ethnicity = replace_na(Ethnicity, "Unknown"),
    Age_Group = replace_na(Age_Group, "Unknown")
  )

# make variable in which we combine Year and Month into a for of a date to create later variable, or use month ending date?
deaths_df3$Date <- as.Date(paste(deaths_df3$Year, deaths_df3$Month, "01", sep = "-"))

# Dobbs was decided June 24, 2022
cutoff <- as.Date("2022-06-24")##when Roe v.Wade was overturned

deaths_df3$Dobbs_Era <- ifelse(deaths_df3$Date < cutoff, "Pre-Dobbs", "Post-Dobbs")
deaths_df3$Dobbs_Era <- factor(deaths_df3$Dobbs_Era, levels = c("Pre-Dobbs", "Post-Dobbs"))


df_age_year <- deaths_df %>%
  filter(!is.na(Age_Group), !is.na(Year)) %>%
  group_by(Year, Age_Group) %>%
  summarise(Maternal_Deaths = sum(Maternal_Deaths, na.rm = TRUE))
df_age_year
# A tibble: 21 × 3
# Groups:   Year [7]
   Year  Age_Group         Maternal_Deaths
   <fct> <fct>                       <int>
 1 2019  25-39 years                  6017
 2 2019  40 years and over            1139
 3 2019  Under 25 years               1320
 4 2020  25-39 years                  6824
 5 2020  40 years and over            1361
 6 2020  Under 25 years               1312
 7 2021  25-39 years                  8338
 8 2021  40 years and over            1975
 9 2021  Under 25 years               1558
10 2022  25-39 years                  9191
# ℹ 11 more rows
Code
df_total_year <-deaths_df%>%
  filter(Group=="Total") %>%
  group_by(Year, Is_Total) %>%
  summarise(Maternal_Deaths = sum(Maternal_Deaths, na.rm = TRUE))
df_total_year
# A tibble: 7 × 3
# Groups:   Year [7]
  Year  Is_Total Maternal_Deaths
  <fct> <lgl>              <int>
1 2019  TRUE                8482
2 2020  TRUE                9503
3 2021  TRUE               11871
4 2022  TRUE               13022
5 2023  TRUE                8554
6 2024  TRUE                8077
7 2025  TRUE                1971
Code
#deaths_cleaned <- deaths_df3 %>%
#  filter(Age_Group != "Unknown",Ethnicity != "Unknown") %>%
 # droplevels()

#deaths_df2<-deaths_df %>%
#  filter(!(Is_Total & (is.na(Ethnicity) | is.na(Age_Group))))###filter out is total if ethnicity or age when total is true
Code <<<<<<< HEAD
#Data Exploration Graphs
df_ethnicity <- deaths_df %>%
    filter(!is.na(Ethnicity)) %>%
    group_by(Ethnicity) %>%
    summarise(Maternal_Deaths = sum(Maternal_Deaths, na.rm = TRUE))
barplot(df_ethnicity$Maternal_Deaths,names.arg=c('Am Ind/Al Nat', 'Asian, NH','Black, NH','Hispanic','White,NH'),main="Maternal Mortality Totals 2019-2024 by Ethnicity")
=======
plot_missing(deaths_df)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD

Code
df_tot_year_ts<-ts(df_total_year$Maternal_Deaths, start = c(2019), frequency = 1)
plot.ts(df_tot_year_ts,main='Yearly Total Maternal Deaths 2019-2024',xlab='Year',ylab='Maternal Deaths')

Code
total_ts <- deaths_df%>%
  filter(Group=='Total')
total_ts <- ts(total_ts$Maternal_Deaths,start=c(2019,1),frequency=12)
plot.ts(total_ts,main='Monthly Total Maternal Deaths 2019-2024',ylab='Maternal Deaths',xlab='Time')  

Code
His <- filtered_df%>%
  filter(Subgroup=='Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename(Hispanic=Maternal_Deaths)
bl <- filtered_df%>%
  filter(Subgroup=='Black, Non-Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename('Black_Non-Hispanic' = Maternal_Deaths)
wh <- filtered_df%>%
  filter(Subgroup=='White, Non-Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename('White_Non-Hispanic'=Maternal_Deaths)
as <- filtered_df%>%
  filter(Subgroup=='Asian, Non-Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename('Asian_Non-Hispanic'=Maternal_Deaths)
aina <- df%>%
  filter(Subgroup=='American Indian or Alaska Native, Non-Hispanic')%>%
  select(Maternal_Deaths)%>% 
  rename('American_Indian_or_Alaska_Native, Non-Hispanic'=Maternal_Deaths)
ethn_ts <- bind_cols(His,bl,wh,as,aina) 
ethn_ts[is.na(ethn_ts)] <- 5 #NA values are between 1 and 9 and left empty for privacy reasons, replaced with mean of 5
ethn_ts <- ts(ethn_ts, start=c(2019,1), frequency = 12)
plot.ts(ethn_ts, plot.type='single',main='Monthly Maternal Deaths by Ethnicity 2019-2024', ylab='Maternal Deaths',xlab='Time')

=======

>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code <<<<<<< HEAD
options(repos = c(CRAN = "https://cloud.r-project.org"))#specify cran mirror
if (!require("lme4")) install.packages("lme4")#if required to install
library(lme4)#initiate from library
=======
#Data Exploration Graphs
colors=c('yellow','orange', 'blue', 'red', 'green')
df_ethnicity <- deaths_df %>%
    filter(!is.na(Ethnicity)) %>%
    group_by(Ethnicity) %>%
    summarise(Maternal_Deaths = sum(Maternal_Deaths, na.rm = TRUE))
barplot(df_ethnicity$Maternal_Deaths,names.arg=c('Am In/Ala', 'Asian, NH','Black, NH','Hispanic','White,NH'),main="Maternal Mortality Totals 2019-2024 by Ethnicity",col=colors,xlab='Ethnicity',ylab='Maternal Deaths')
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

Code <<<<<<< HEAD
options(repos = c(CRAN = "https://cloud.r-project.org"))#specify cran mirror
if (!require("performance")) install.packages("performance")#if required to install
library(performance)
=======
df_tot_year_ts<-ts(df_total_year$Maternal_Deaths, start = c(2019), frequency = 1)
plot.ts(df_tot_year_ts,main='Yearly Total Maternal Deaths 2019-2024',xlab='Year',ylab='Maternal Deaths',col='navy')
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

Code <<<<<<< HEAD
options(repos = c(CRAN = "https://cloud.r-project.org"))#specify cran mirror
if (!require("glmmTMB")) install.packages("glmmTMB")#if required to install
library(glmmTMB)
=======
df_tot_year_rate <- deaths_df%>%
  filter(Group=='Total')%>%
  select(Maternal_Mortality_Rate)
df_tot_year_rate_ts<-ts(df_tot_year_rate$Maternal_Mortality_Rate, start = c(2019,1), frequency = 12)
plot.ts(df_tot_year_rate_ts,main='Monthly Total Maternal Mortality Rate 2019-2024',xlab='Year',ylab='Maternal Deaths per 100,000 Live Births',col='navy')
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

Code
total_df <- deaths_df%>%
  filter(Group=='Total')
total_ts <- ts(total_df$Maternal_Deaths,start=c(2019,1),frequency=12)
plot.ts(total_ts,main='Monthly Total Maternal Deaths 2019-2024',ylab='Maternal Deaths',xlab='Time',col='navy')  

Code
His <- filtered_df%>%
  filter(Subgroup=='Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename(Hispanic=Maternal_Deaths)
bl <- filtered_df%>%
  filter(Subgroup=='Black, Non-Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename('Black_Non-Hispanic' = Maternal_Deaths)
wh <- filtered_df%>%
  filter(Subgroup=='White, Non-Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename('White_Non-Hispanic'=Maternal_Deaths)
as <- filtered_df%>%
  filter(Subgroup=='Asian, Non-Hispanic')%>%
  select(Maternal_Deaths)%>%
  rename('Asian_Non-Hispanic'=Maternal_Deaths)
aina <- df%>%
  filter(Subgroup=='American Indian or Alaska Native, Non-Hispanic')%>%
  select(Maternal_Deaths)%>% 
  rename('American_Indian_or_Alaska_Native, Non-Hispanic'=Maternal_Deaths)
ethn_ts <- bind_cols(His,bl,wh,as,aina) 
ethn_ts <- ts(ethn_ts, start=c(2019,1), frequency = 12)
colors=c('red','blue','green','orange','yellow')
plot.ts(ethn_ts, plot.type='single',main='Monthly Maternal Deaths by Ethnicity 2019-2024', ylab='Maternal Deaths',xlab='Time',col=colors)
legend("topleft",                     
       legend = c("Hispanic", "Black, Non-Hispanic", "White, Non-Hispanic",'Asian, Non-Hispanic','American Indian, Alaska Native, Non-Hisp'),  # Labels
       col = colors,                   
       lty = 1,                        
       lwd = 2,                     
       bty = "n",
       cex = 0.8)                      

Code
HisR <- filtered_df%>%
  filter(Subgroup=='Hispanic')%>%
  select(Maternal_Mortality_Rate)%>%
  rename(Hispanic=Maternal_Mortality_Rate)
blR <- filtered_df%>%
  filter(Subgroup=='Black, Non-Hispanic')%>%
  select(Maternal_Mortality_Rate)%>%
  rename('Black_Non-Hispanic' = Maternal_Mortality_Rate)
whR <- filtered_df%>%
  filter(Subgroup=='White, Non-Hispanic')%>%
  select(Maternal_Mortality_Rate)%>%
  rename('White_Non-Hispanic'=Maternal_Mortality_Rate)
asR <- filtered_df%>%
  filter(Subgroup=='Asian, Non-Hispanic')%>%
  select(Maternal_Mortality_Rate)%>%
  rename('Asian_Non-Hispanic'=Maternal_Mortality_Rate)
ainaR <- df%>%
  filter(Subgroup=='American Indian or Alaska Native, Non-Hispanic')%>%
  select(Maternal_Mortality_Rate)%>% 
  rename('American_Indian_or_Alaska_Native, Non-Hispanic'=Maternal_Mortality_Rate)
ethnR_ts <- bind_cols(HisR,blR,whR,asR,ainaR) 
ethnR_ts <- ts(ethnR_ts, start=c(2019,1), frequency = 12)
colors=c('red','blue','green','orange','yellow')
plot.ts(ethnR_ts, plot.type='single',main='Monthly Maternal Mortality Rate by Ethnicity 2019-2024', ylab='Maternal Deaths per 100,000 Live Births',xlab='Time',col=colors)
legend("topleft",                     
       legend = c("Hispanic", "Black, Non-Hispanic", "White, Non-Hispanic",'Asian, Non-Hispanic','American Indian, Alaska Native, Non-Hisp'),  # Labels
       col = colors,                   
       lty = 1,                        
       lwd = 2,                     
       bty = "n",
       cex = 0.7)            

Code
ageu25 <- deaths_df%>%
  filter(Subgroup=='Under 25 years')%>%
  select(Maternal_Deaths)%>%
  rename('Under_25_years'=Maternal_Deaths)
age25 <- filtered_df%>%
  filter(Subgroup=='25-39 years')%>%
  select(Maternal_Deaths)%>%
  rename('25-39_years' = Maternal_Deaths)
ageo40 <- filtered_df%>%
  filter(Subgroup=='40 years and over')%>%
  select(Maternal_Deaths)%>%
  rename('40_years_and_over'=Maternal_Deaths)
age_ts <- bind_cols(ageu25,age25,ageo40) 
age_ts <- ts(age_ts, start=c(2019,1), frequency = 12)
colors=c('red','blue','green')
plot.ts(age_ts, plot.type='single',main='Monthly Maternal Deaths by Age Group 2019-2024', ylab='Maternal Deaths',xlab='Time',col=colors)
legend("topleft",                     
       legend = c("Under 25 Years", "25-39 Years", "40 Years and over"),  # Labels
       col = colors,                   
       lty = 1,                        
       lwd = 2,                     
       bty = "n",
       cex = 0.8) 

Code
ageu25R <- deaths_df%>%
  filter(Subgroup=='Under 25 years')%>%
  select(Maternal_Mortality_Rate)%>%
  rename('Under_25_years'=Maternal_Mortality_Rate)
age25R <- filtered_df%>%
  filter(Subgroup=='25-39 years')%>%
  select(Maternal_Mortality_Rate)%>%
  rename('25-39_years' = Maternal_Mortality_Rate)
ageo40R <- filtered_df%>%
  filter(Subgroup=='40 years and over')%>%
  select(Maternal_Mortality_Rate)%>%
  rename('40_years_and_over'=Maternal_Mortality_Rate)
ageR_ts <- bind_cols(ageu25R,age25R,ageo40R) 
ageR_ts <- ts(ageR_ts, start=c(2019,1), frequency = 12)
colors=c('red','blue','green')
plot.ts(ageR_ts, plot.type='single',main='Monthly Maternal Mortality Rate by Age Group 2019-2024', ylab='Maternal Deaths per 100,000 live Births',xlab='Time',col=colors)
legend("topleft",                     
       legend = c("Under 25 Years", "25-39 Years", "40 Years and over"),  # Labels
       col = colors,                   
       lty = 1,                        
       lwd = 2,                     
       bty = "n",
       cex = 0.8) 

Code <<<<<<< HEAD
poissonmodel_glmm_all <- glmer(Maternal_Deaths ~ Ethnicity +Age_Group+ Dobbs_Era+(1|Year),
                              offset=log(Live_Births),
                              family = poisson(link = "log"), 
                              data = deaths_df3)
poissonmodel_glmm_all
=======
plot_histogram(deaths_df_ethnic$Maternal_Deaths,title='Histogram of Maternal Deaths - Ethnicity')

Code
plot_histogram(deaths_df_age$Maternal_Deaths, title='Histogram of Maternal Deaths - Age')

Code
plot_histogram(total_df$Maternal_Deaths,title='Histogram of Maternal Deaths - Total')

Code
#check overdispersion by for ethnicity data
deaths_df_ethnic %>% summarize(mean(Maternal_Deaths),
var(Maternal_Deaths))
  mean(Maternal_Deaths) var(Maternal_Deaths)
1              191.3046             17438.19
Code
#check overdispersion for age data
deaths_df_age %>% summarize(mean(Maternal_Deaths),var(Maternal_Deaths))
  mean(Maternal_Deaths) var(Maternal_Deaths)
1              276.2407              54102.1
Code
#check overdispersion for total data
total_df %>% summarize(mean(Maternal_Deaths),
var(Maternal_Deaths))
  mean(Maternal_Deaths) var(Maternal_Deaths)
1              828.8889             30819.62
Code
poissonmodel_glmm_ethnicity <- glmer(Maternal_Deaths ~ Ethnicity  + (1 | Month/Year), 
                              offset = log(Live_Births),
                              family = poisson(link = "log"), 
                              data = deaths_df)
poissonmodel_glmm_ethnicity
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: poisson  ( log )
Formula: Maternal_Deaths ~ Ethnicity + Age_Group + Dobbs_Era + (1 | Year)
   Data: deaths_df3
 Offset: log(Live_Births)
      AIC       BIC    logLik -2*log(L)  df.resid 
 5052.509  5095.406 -2516.254  5032.509       529 
Random effects:
 Groups Name        Std.Dev.
 Year   (Intercept) 0.1639  
Number of obs: 539, groups:  Year, 7
Fixed Effects:
                                            (Intercept)  
                                               -8.79157  
                           EthnicityBlack, Non-Hispanic  
                                                1.29455  
                           EthnicityWhite, Non-Hispanic  
                                                0.26912  
                                      EthnicityHispanic  
                                                0.14808  
EthnicityAmerican Indian or Alaska Native, Non-Hispanic  
                                                1.63899  
                                       EthnicityUnknown  
                                                0.02359  
                                   Age_Group25-39 years  
                                                0.39294  
                             Age_Group40 years and over  
                                                1.81014  
                                    Dobbs_EraPost-Dobbs  
                                               -0.22582  
fit warnings:
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
Code <<<<<<< HEAD
check_overdispersion(poissonmodel_glmm_all)
=======
jack<-deaths_df2%>%filter(Group != "Total" & Subgroup != "Total")
Code
poissonmodel_glmm_both <- glmer(Maternal_Deaths ~ Subgroup+ (1|Month/Year),
                              offset=log(Live_Births),
                              family = poisson(link = "log"), 
                              data = jack)
poissonmodel_glmm_both
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: poisson  ( log )
Formula: Maternal_Deaths ~ Subgroup + (1 | Month/Year)
   Data: jack
 Offset: log(Live_Births)
      AIC       BIC    logLik -2*log(L)  df.resid 
 4606.684  4649.184 -2293.342  4586.684       508 
Random effects:
 Groups     Name        Std.Dev.
 Year:Month (Intercept) 0.1925  
 Month      (Intercept) 0.0000  
Number of obs: 518, groups:  Year:Month, 72; Month, 12
Fixed Effects:
                                           (Intercept)  
                                               -8.4596  
                             Subgroup40 years and over  
                                                1.4269  
SubgroupAmerican Indian or Alaska Native, Non-Hispanic  
                                                1.1756  
                           SubgroupAsian, Non-Hispanic  
                                               -0.4335  
                           SubgroupBlack, Non-Hispanic  
                                                0.8776  
                                      SubgroupHispanic  
                                               -0.2603  
                                SubgroupUnder 25 years  
                                               -0.3922  
                           SubgroupWhite, Non-Hispanic  
                                               -0.1480  
optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings 
Code
poissonmodel_glmm_ethnicity_nooffset <- glmer(Maternal_Deaths ~ Ethnicity + (1 |Month/ Year), 
                              family = poisson(link = "log"), 
                              data = deaths_df)
poissonmodel_glmm_ethnicity_nooffset
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: poisson  ( log )
Formula: Maternal_Deaths ~ Ethnicity + (1 | Month/Year)
   Data: deaths_df
      AIC       BIC    logLik -2*log(L)  df.resid 
 2783.250  2809.223 -1384.625  2769.250       295 
Random effects:
 Groups     Name        Std.Dev. 
 Year:Month (Intercept) 1.881e-01
 Month      (Intercept) 8.126e-05
Number of obs: 302, groups:  Year:Month, 72; Month, 12
Fixed Effects:
                 (Intercept)  EthnicityAsian, Non-Hispanic  
                      2.8909                        0.5271  
EthnicityBlack, Non-Hispanic             EthnicityHispanic  
                      2.6865                        2.1079  
EthnicityWhite, Non-Hispanic  
                      2.9372  
optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings 
Code
poissonmodel_glmm_both_nooffset <- glmer(Maternal_Deaths ~ Subgroup + (1|Month/Year),#includes ethncity and age
                              family = poisson(link = "log"), 
                              data = jack)
poissonmodel_glmm_both_nooffset
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: poisson  ( log )
Formula: Maternal_Deaths ~ Subgroup + (1 | Month/Year)
   Data: jack
      AIC       BIC    logLik -2*log(L)  df.resid 
 4565.783  4608.283 -2272.891  4545.783       508 
Random effects:
 Groups     Name        Std.Dev.
 Year:Month (Intercept) 0.1905  
 Month      (Intercept) 0.0000  
Number of obs: 518, groups:  Year:Month, 72; Month, 12
Fixed Effects:
                                           (Intercept)  
                                                6.3553  
                             Subgroup40 years and over  
                                               -1.5551  
SubgroupAmerican Indian or Alaska Native, Non-Hispanic  
                                               -3.4717  
                           SubgroupAsian, Non-Hispanic  
                                               -2.9378  
                           SubgroupBlack, Non-Hispanic  
                                               -0.7785  
                                      SubgroupHispanic  
                                               -1.3570  
                                SubgroupUnder 25 years  
                                               -1.6026  
                           SubgroupWhite, Non-Hispanic  
                                               -0.5277  
optimizer (Nelder_Mead) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings 
Code
check_overdispersion(poissonmodel_glmm_ethnicity)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
# Overdispersion test

<<<<<<< HEAD
       dispersion ratio =    2.448
  Pearson's Chi-Squared = 1294.998
                p-value =  < 0.001
======= dispersion ratio = 1.386 Pearson's Chi-Squared = 408.991 p-value = < 0.001
Code
check_overdispersion(poissonmodel_glmm_ethnicity_nooffset)
# Overdispersion test

       dispersion ratio =   1.454
  Pearson's Chi-Squared = 428.875
                p-value = < 0.001
Code
check_overdispersion(poissonmodel_glmm_both)
# Overdispersion test

       dispersion ratio =   1.299
  Pearson's Chi-Squared = 659.928
                p-value = < 0.001
Code
check_overdispersion(poissonmodel_glmm_both_nooffset)
# Overdispersion test

       dispersion ratio =   1.205
  Pearson's Chi-Squared = 612.172
                p-value =   0.001
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

##overdispersion still detected

Code <<<<<<< HEAD
##with offset # all fixed effects
all_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity + Age_Group +Dobbs_Era+ (1|Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = deaths_df3
)
summary(all_glmmodel_nb)
=======
##with offset
ethnicityonly_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity + (1 |Month/Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = deaths_df
)
summary(ethnicityonly_glmmodel_nb)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
 Family: nbinom2  ( log )
Formula:          
Maternal_Deaths ~ Ethnicity + Age_Group + Dobbs_Era + (1 | Year)
Data: deaths_df3
 Offset: log(Live_Births)

      AIC       BIC    logLik -2*log(L)  df.resid 
   4749.5    4796.7   -2363.8    4727.5       528 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.02772  0.1665  
Number of obs: 539, groups:  Year, 7

<<<<<<< HEAD
Dispersion parameter for nbinom2 family ():  152 
=======
Dispersion parameter for nbinom2 family ():  217 

Conditional model:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                  -7.26818    0.06197 -117.29  < 2e-16 ***
EthnicityAsian, Non-Hispanic -1.62191    0.06199  -26.16  < 2e-16 ***
EthnicityBlack, Non-Hispanic -0.30679    0.05879   -5.22  1.8e-07 ***
EthnicityHispanic            -1.45856    0.05911  -24.68  < 2e-16 ***
EthnicityWhite, Non-Hispanic -1.33834    0.05869  -22.80  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
both_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Subgroup+  (1 |Month/Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = jack
)
summary(both_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Subgroup + (1 | Month/Year)
Data: jack
 Offset: log(Live_Births)

      AIC       BIC    logLik -2*log(L)  df.resid 
   4591.0    4637.7   -2284.5    4569.0       507 

Random effects:

Conditional model:
 Groups     Name        Variance  Std.Dev. 
 Year:Month (Intercept) 3.724e-02 1.930e-01
 Month      (Intercept) 2.118e-10 1.455e-05
Number of obs: 518, groups:  Year:Month, 72; Month, 12

Dispersion parameter for nbinom2 family ():  556 

Conditional model:
                                                       Estimate Std. Error
(Intercept)                                            -8.45935    0.02380
Subgroup40 years and over                               1.42486    0.01369
SubgroupAmerican Indian or Alaska Native, Non-Hispanic  1.17596    0.05558
SubgroupAsian, Non-Hispanic                            -0.43285    0.02284
SubgroupBlack, Non-Hispanic                             0.88117    0.01126
SubgroupHispanic                                       -0.26481    0.01296
SubgroupUnder 25 years                                 -0.39048    0.01388
SubgroupWhite, Non-Hispanic                            -0.14833    0.01072
                                                       z value Pr(>|z|)    
(Intercept)                                             -355.4   <2e-16 ***
Subgroup40 years and over                                104.1   <2e-16 ***
SubgroupAmerican Indian or Alaska Native, Non-Hispanic    21.2   <2e-16 ***
SubgroupAsian, Non-Hispanic                              -18.9   <2e-16 ***
SubgroupBlack, Non-Hispanic                               78.2   <2e-16 ***
SubgroupHispanic                                         -20.4   <2e-16 ***
SubgroupUnder 25 years                                   -28.1   <2e-16 ***
SubgroupWhite, Non-Hispanic                              -13.8   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
###Without offset, Ethnicity as fixed effect, Year as random effect #no offset
ethnicityonly_glmmodel_nb2 <- glmmTMB(
  Maternal_Deaths ~  Ethnicity + (1 |Month/Year),
  family = nbinom2,
  data = deaths_df
)
summary(ethnicityonly_glmmodel_nb2)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + (1 | Month/Year)
Data: deaths_df

      AIC       BIC    logLik -2*log(L)  df.resid 
   2732.9    2762.6   -1358.4    2716.9       294 

Random effects:

Conditional model:
 Groups     Name        Variance  Std.Dev. 
 Year:Month (Intercept) 3.366e-02 1.835e-01
 Month      (Intercept) 1.702e-10 1.305e-05
Number of obs: 302, groups:  Year:Month, 72; Month, 12

Dispersion parameter for nbinom2 family ():  193 

Conditional model:
                             Estimate Std. Error z value Pr(>|z|)    
(Intercept)                   2.90488    0.06226   46.66   <2e-16 ***
EthnicityAsian, Non-Hispanic  0.51728    0.06249    8.28   <2e-16 ***
EthnicityBlack, Non-Hispanic  2.67788    0.05935   45.12   <2e-16 ***
EthnicityHispanic             2.08779    0.05966   35.00   <2e-16 ***
EthnicityWhite, Non-Hispanic  2.92331    0.05925   49.34   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
#without offset
both_glmmodel_nb2 <- glmmTMB(
  Maternal_Deaths ~  Subgroup + (1 |Month/Year),
  family = nbinom2,
  data = jack
)
summary(both_glmmodel_nb2)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Subgroup + (1 | Month/Year)
Data: jack

      AIC       BIC    logLik -2*log(L)  df.resid 
   4560.7    4607.4   -2269.3    4538.7       507 

Random effects:

Conditional model:
 Groups     Name        Variance  Std.Dev. 
 Year:Month (Intercept) 3.613e-02 1.901e-01
 Month      (Intercept) 2.443e-10 1.563e-05
Number of obs: 518, groups:  Year:Month, 72; Month, 12

Dispersion parameter for nbinom2 family ():  916 
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                             -8.79009    0.06789
EthnicityBlack, Non-Hispanic                             1.29873    0.02559
EthnicityWhite, Non-Hispanic                             0.26615    0.02537
EthnicityHispanic                                        0.13590    0.02637
EthnicityAmerican Indian or Alaska Native, Non-Hispanic  1.63322    0.06254
EthnicityUnknown                                         0.02565    0.02683
Age_Group25-39 years                                     0.38823    0.01777
Age_Group40 years and over                               1.80017    0.02009
Age_GroupUnknown                                              NA         NA
Dobbs_EraPost-Dobbs                                     -0.22146    0.02273
                                                        z value Pr(>|z|)    
(Intercept)                                             -129.48  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              50.74  < 2e-16 ***
EthnicityWhite, Non-Hispanic                              10.49  < 2e-16 ***
EthnicityHispanic                                          5.15 2.55e-07 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   26.12  < 2e-16 ***
EthnicityUnknown                                           0.96    0.339    
Age_Group25-39 years                                      21.85  < 2e-16 ***
Age_Group40 years and over                                89.61  < 2e-16 ***
Age_GroupUnknown                                             NA       NA    
Dobbs_EraPost-Dobbs                                       -9.74  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code <<<<<<< HEAD
ethnicity_agegroup_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity+ Age_Group + (1|Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = deaths_df3
)
summary(ethnicity_agegroup_glmmodel_nb)
=======
glmmodel_nb3 <- glmmTMB(
  Maternal_Deaths ~  Ethnicity +(1|Month/Year),
  family = nbinom2,
  data = deaths_df
)
summary(glmmodel_nb3)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + Age_Group + (1 | Year)
Data: deaths_df3
 Offset: log(Live_Births)

      AIC       BIC    logLik -2*log(L)  df.resid 
   4832.1    4875.0   -2406.1    4812.1       529 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.03328  0.1824  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family ():  109 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                             -8.89866    0.07290
EthnicityBlack, Non-Hispanic                             1.29879    0.02692
EthnicityWhite, Non-Hispanic                             0.26522    0.02671
EthnicityHispanic                                        0.13422    0.02766
EthnicityAmerican Indian or Alaska Native, Non-Hispanic  1.66405    0.06424
EthnicityUnknown                                         0.02558    0.02810
Age_Group25-39 years                                     0.38781    0.01963
Age_Group40 years and over                               1.79865    0.02175
Age_GroupUnknown                                              NA         NA
                                                        z value Pr(>|z|)    
(Intercept)                                             -122.07  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              48.24  < 2e-16 ***
EthnicityWhite, Non-Hispanic                               9.93  < 2e-16 ***
EthnicityHispanic                                          4.85 1.22e-06 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   25.90  < 2e-16 ***
EthnicityUnknown                                           0.91    0.363    
Age_Group25-39 years                                      19.76  < 2e-16 ***
Age_Group40 years and over                                82.68  < 2e-16 ***
Age_GroupUnknown                                             NA       NA    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code <<<<<<< HEAD
ethnicity_year_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity + Dobbs_Era +  (1 |Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = deaths_df3
)
summary(ethnicity_year_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + Dobbs_Era + (1 | Year)
Data: deaths_df3
 Offset: log(Live_Births)

      AIC       BIC    logLik -2*log(L)  df.resid 
   6336.8    6375.4   -3159.4    6318.8       530 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.0268   0.1637  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family (): 3.83 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                             -8.78230    0.09801
EthnicityBlack, Non-Hispanic                             1.29287    0.08642
EthnicityWhite, Non-Hispanic                             0.25570    0.08634
EthnicityHispanic                                        0.11896    0.08676
EthnicityAmerican Indian or Alaska Native, Non-Hispanic  1.63572    0.16428
EthnicityUnknown                                         1.03437    0.07184
Dobbs_EraPost-Dobbs                                     -0.20940    0.08673
                                                        z value Pr(>|z|)    
(Intercept)                                              -89.60  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              14.96  < 2e-16 ***
EthnicityWhite, Non-Hispanic                               2.96  0.00306 ** 
EthnicityHispanic                                          1.37  0.17036    
EthnicityAmerican Indian or Alaska Native, Non-Hispanic    9.96  < 2e-16 ***
EthnicityUnknown                                          14.40  < 2e-16 ***
Dobbs_EraPost-Dobbs                                       -2.41  0.01576 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
ethnicity_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity  +  (1 |Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = deaths_df3
)
summary(ethnicity_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + (1 | Year)
Data: deaths_df3
 Offset: log(Live_Births)

      AIC       BIC    logLik -2*log(L)  df.resid 
   6340.4    6374.7   -3162.2    6324.4       531 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.03409  0.1846  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family ():  3.8 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                             -8.88365    0.09443
EthnicityBlack, Non-Hispanic                             1.29300    0.08672
EthnicityWhite, Non-Hispanic                             0.25617    0.08665
EthnicityHispanic                                        0.11995    0.08706
EthnicityAmerican Indian or Alaska Native, Non-Hispanic  1.66268    0.16416
EthnicityUnknown                                         1.03544    0.07208
                                                        z value Pr(>|z|)    
(Intercept)                                              -94.08  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              14.91  < 2e-16 ***
EthnicityWhite, Non-Hispanic                               2.96  0.00311 ** 
EthnicityHispanic                                          1.38  0.16827    
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   10.13  < 2e-16 ***
EthnicityUnknown                                          14.36  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
year_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Dobbs_Era +  (1 |Year),
  offset=log(Live_Births),
  family = nbinom2,
  data = deaths_df3
)
summary(year_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Dobbs_Era + (1 | Year)
Data: deaths_df3
 Offset: log(Live_Births)

      AIC       BIC    logLik -2*log(L)  df.resid 
   6680.4    6697.5   -3336.2    6672.4       535 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.03925  0.1981  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family (): 2.12 

Conditional model:
                    Estimate Std. Error z value Pr(>|z|)    
(Intercept)         -7.93118    0.09752  -81.33   <2e-16 ***
Dobbs_EraPost-Dobbs -0.26902    0.11209   -2.40   0.0164 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
##with no offset # all fixed effects
allno_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity + Age_Group+ Dobbs_Era + (1 |Year),
  family = nbinom2,
  data = deaths_df3
)
summary(allno_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          
Maternal_Deaths ~ Ethnicity + Age_Group + Dobbs_Era + (1 | Year)
Data: deaths_df3

      AIC       BIC    logLik -2*log(L)  df.resid 
   4729.4    4776.6   -2353.7    4707.4       528 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.02563  0.1601  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family ():  166 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                              3.52242    0.06553
EthnicityBlack, Non-Hispanic                             2.13928    0.02530
EthnicityWhite, Non-Hispanic                             2.38642    0.02508
EthnicityHispanic                                        1.54559    0.02609
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -0.50652    0.06213
EthnicityUnknown                                         1.31399    0.02655
Age_Group25-39 years                                     1.60176    0.01733
Age_Group40 years and over                               0.03455    0.01971
Age_GroupUnknown                                              NA         NA
Dobbs_EraPost-Dobbs                                     -0.22316    0.02208
                                                        z value Pr(>|z|)    
(Intercept)                                               53.75  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              84.55  < 2e-16 ***
EthnicityWhite, Non-Hispanic                              95.16  < 2e-16 ***
EthnicityHispanic                                         59.24  < 2e-16 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   -8.15 3.56e-16 ***
EthnicityUnknown                                          49.50  < 2e-16 ***
Age_Group25-39 years                                      92.41  < 2e-16 ***
Age_Group40 years and over                                 1.75   0.0795 .  
Age_GroupUnknown                                             NA       NA    
Dobbs_EraPost-Dobbs                                      -10.11  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
ethnicity_agegroupno_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity+ Age_Group + (1 |Year),
  family = nbinom2,
  data = deaths_df3
)
summary(ethnicity_agegroupno_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + Age_Group + (1 | Year)
Data: deaths_df3

      AIC       BIC    logLik -2*log(L)  df.resid 
   4817.4    4860.3   -2398.7    4797.4       529 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.03222  0.1795  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family ():  115 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                              3.41360    0.07181
EthnicityBlack, Non-Hispanic                             2.13864    0.02667
EthnicityWhite, Non-Hispanic                             2.38503    0.02646
EthnicityHispanic                                        1.54345    0.02742
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -0.47541    0.06389
EthnicityUnknown                                         1.31327    0.02786
Age_Group25-39 years                                     1.60149    0.01926
Age_Group40 years and over                               0.03291    0.02143
Age_GroupUnknown                                              NA         NA
                                                        z value Pr(>|z|)    
(Intercept)                                               47.54  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              80.19  < 2e-16 ***
EthnicityWhite, Non-Hispanic                              90.15  < 2e-16 ***
EthnicityHispanic                                         56.29  < 2e-16 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   -7.44 9.95e-14 ***
EthnicityUnknown                                          47.14  < 2e-16 ***
Age_Group25-39 years                                      83.13  < 2e-16 ***
Age_Group40 years and over                                 1.54    0.125    
Age_GroupUnknown                                             NA       NA    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
ethnicity_yearno_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity + Dobbs_Era +  (1 |Year),
  family = nbinom2,
  data = deaths_df3
)
summary(ethnicity_yearno_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + Dobbs_Era + (1 | Year)
Data: deaths_df3

      AIC       BIC    logLik -2*log(L)  df.resid 
   6326.1    6364.7   -3154.0    6308.1       530 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.01976  0.1406  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family (): 3.84 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                              3.52969    0.09154
EthnicityBlack, Non-Hispanic                             2.12758    0.08628
EthnicityWhite, Non-Hispanic                             2.37249    0.08621
EthnicityHispanic                                        1.52862    0.08663
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -0.47021    0.16403
EthnicityUnknown                                         2.14744    0.07139
Dobbs_EraPost-Dobbs                                     -0.19502    0.08237
                                                        z value Pr(>|z|)    
(Intercept)                                               38.56  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              24.66  < 2e-16 ***
EthnicityWhite, Non-Hispanic                              27.52  < 2e-16 ***
EthnicityHispanic                                         17.65  < 2e-16 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   -2.87  0.00415 ** 
EthnicityUnknown                                          30.08  < 2e-16 ***
Dobbs_EraPost-Dobbs                                       -2.37  0.01791 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
ethnicityno_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Ethnicity  +  (1 |Year),
  family = nbinom2,
  data = deaths_df3
)
summary(ethnicityno_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Ethnicity + (1 | Year)
Data: deaths_df3

      AIC       BIC    logLik -2*log(L)  df.resid 
   6329.6    6363.9   -3156.8    6313.6       531 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.02438  0.1562  
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family (): 3.81 

Conditional model:
                                                        Estimate Std. Error
(Intercept)                                              3.43625    0.08665
EthnicityBlack, Non-Hispanic                             2.12821    0.08658
EthnicityWhite, Non-Hispanic                             2.37341    0.08652
EthnicityHispanic                                        1.52904    0.08693
EthnicityAmerican Indian or Alaska Native, Non-Hispanic -0.44427    0.16392
EthnicityUnknown                                         2.14815    0.07163
                                                        z value Pr(>|z|)    
(Intercept)                                               39.66  < 2e-16 ***
EthnicityBlack, Non-Hispanic                              24.58  < 2e-16 ***
EthnicityWhite, Non-Hispanic                              27.43  < 2e-16 ***
EthnicityHispanic                                         17.59  < 2e-16 ***
EthnicityAmerican Indian or Alaska Native, Non-Hispanic   -2.71  0.00672 ** 
EthnicityUnknown                                          29.99  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
yearno_glmmodel_nb <- glmmTMB(
  Maternal_Deaths ~  Dobbs_Era +  (1 |Year),
  family = nbinom2,
  data = deaths_df3
)
summary(yearno_glmmodel_nb)
 Family: nbinom2  ( log )
Formula:          Maternal_Deaths ~ Dobbs_Era + (1 | Year)
Data: deaths_df3

      AIC       BIC    logLik -2*log(L)  df.resid 
   6869.8    6886.9   -3430.9    6861.8       535 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 Year   (Intercept) 0.009802 0.09901 
Number of obs: 539, groups:  Year, 7

Dispersion parameter for nbinom2 family (): 1.53 

Conditional model:
                    Estimate Std. Error z value Pr(>|z|)    
(Intercept)          5.46647    0.06714   81.42   <2e-16 ***
Dobbs_EraPost-Dobbs -0.15281    0.09376   -1.63    0.103    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
performance::check_model(all_glmmodel_nb)#interesting
=======
performance::check_model(ethnicityonly_glmmodel_nb)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD

=======

>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code <<<<<<< HEAD
performance::check_model(ethnicity_agegroup_glmmodel_nb)#interesting
=======
performance::check_model(both_glmmodel_nb)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD

=======

>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code <<<<<<< HEAD
performance::check_model(ethnicity_year_glmmodel_nb)#no
=======
performance::check_model(ethnicityonly_glmmodel_nb2)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD

=======

>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code <<<<<<< HEAD
performance::check_model(ethnicity_glmmodel_nb)#no
=======
performance::check_model(both_glmmodel_nb2)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD

Code
performance::check_model(year_glmmodel_nb)##no

Code
performance::check_model(allno_glmmodel_nb)##interesting

Code
performance::check_model(ethnicity_agegroupno_glmmodel_nb)#interesting

Code
performance::check_model(ethnicity_yearno_glmmodel_nb)#no

Code
performance::check_model(ethnicityno_glmmodel_nb)# no

Code
performance::check_model(yearno_glmmodel_nb)#no

Code
#performance::check_model(all2_glmmodel_nb)#interesting
#performance::check_model(eage_glmmodel_nb)
#performance::check_model(allno2glmmmodel)#interesting
Code
performance::r2(all_glmmodel_nb)
# R2 for Mixed Models

  Conditional R2: 0.058
     Marginal R2: 0.055
Code
check_singularity(all_glmmodel_nb)
[1] FALSE
Code
performance::r2(ethnicity_agegroup_glmmodel_nb)
# R2 for Mixed Models

  Conditional R2: 0.057
     Marginal R2: 0.053
Code
check_singularity(ethnicity_agegroup_glmmodel_nb)
[1] FALSE
Code
performance::r2(allno_glmmodel_nb)
# R2 for Mixed Models

  Conditional R2: 0.988
     Marginal R2: 0.960
Code
check_singularity(allno_glmmodel_nb)
[1] FALSE
Code
performance::r2(ethnicity_agegroupno_glmmodel_nb)
# R2 for Mixed Models

  Conditional R2: 0.986
     Marginal R2: 0.950
Code
check_singularity(ethnicity_agegroupno_glmmodel_nb)
[1] FALSE
Code
resid_pearson <- residuals(all_glmmodel_nb, type = "pearson")
plot(resid_pearson ~ fitted(all_glmmodel_nb), main = "Pearson residuals vs Fitted")
abline(h = 0, col = "red")

=======

>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code <<<<<<< HEAD
overdisp_fun <- function(all_glmmodel_nb) {
  rdf <- df.residual(all_glmmodel_nb)
  rp <- residuals(all_glmmodel_nb, type = "pearson")
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq / rdf
  pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
  c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)
}

overdisp_fun(all_glmmodel_nb)
      chisq       ratio         rdf           p 
555.8777207   1.0527987 528.0000000   0.1938884 
Code
#ratio is around 1 which is good and not too indicative of overdispersion
#pp-value is greater that P<0.005 so that means there is no significant overdispersion
Code
arm::binnedplot(fitted(all_glmmodel_nb), 
           residuals(all_glmmodel_nb, type = "response"), 
           main = "Binned residual plot")
=======
#for non-normal error models
sim <- simulateResiduals(ethnicityonly_glmmodel_nb)
plot(sim)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD

=======

>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code <<<<<<< HEAD
# fan shape  heteroskedasticity (non-constant variance)- coon when with count data
Code
ranef(all_glmmodel_nb)$cond
$Year
      (Intercept)
2019 -0.248444844
2020 -0.125183654
2021  0.137748386
2022  0.305373238
2023 -0.001826617
2024 -0.016851766
2025 -0.050947703
Code
library(ggplot2)


re <- ranef(all_glmmodel_nb)
cond_re <- re$cond
year_re <- as.data.frame(cond_re$Year)
year_re$Year <- rownames(year_re)

ggplot(year_re, aes(x = factor(Year), y = `(Intercept)`)) +
  geom_point(size = 3) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Random Effect Intercepts by Year",
       x = "Year",
       y = "Random Effect Estimate (Intercept)") +
  theme_minimal()
=======
sim2 <- simulateResiduals(ethnicityonly_glmmodel_nb2)
plot(sim2)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

Code <<<<<<< HEAD
#dashed zero is the overall random intercept
# points above zero - years with above average random effect
#points below zero- years below average random effect
=======
sim3 <- simulateResiduals(both_glmmodel_nb)
plot(sim3)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
Code
performance::check_collinearity(all_glmmodel_nb)
# Check for Multicollinearity

Low Correlation

      Term  VIF       VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
 Ethnicity 2.57 [2.27,     2.95]     1.60      0.39     [0.34, 0.44]
 Age_Group 2.56 [2.26,     2.94]     1.60      0.39     [0.34, 0.44]
 Dobbs_Era 1.00 [1.00, 1.06e+07]     1.00      1.00     [0.00, 1.00]
Code <<<<<<< HEAD
performance::check_model(all_glmmodel_nb, verbose = TRUE, residual_type = "normal")
=======
sim4 <- simulateResiduals(both_glmmodel_nb2)
plot(sim4)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1

Code <<<<<<< HEAD
sjPlot::plot_model(all_glmmodel_nb, type = "re") #no extreme points with random effects

Code
sjPlot::tab_model(all_glmmodel_nb)
=======
sjPlot::tab_model(ethnicityonly_glmmodel_nb)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD =======
  Maternal_Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 0.00 0.00 – 0.00 <0.001
Ethnicity [Black,
Non-Hispanic]
3.66 3.49 – 3.85 <0.001
Ethnicity [White,
Non-Hispanic]
1.30 1.24 – 1.37 <0.001
Ethnicity [Hispanic] 1.15 1.09 – 1.21 <0.001
Ethnicity [American
Indian or Alaska Native,
Non-Hispanic]
5.12 4.53 – 5.79 <0.001
Ethnicity [Unknown] 1.03 0.97 – 1.08 0.339
Age_Group25-39 years 1.47 1.42 – 1.53 <0.001
Age Group [40 years and
over]
6.05 5.82 – 6.29 <0.001
Dobbs Era [Post-Dobbs] 0.80 0.77 – 0.84 <0.001
Random Effects
σ2 8.03
τ00 Yearτ00 Year:Month 0.04
τ00 Month 0.00
N Year 6
N Month 12
Observations 302
Code
sjPlot::tab_model(ethnicityonly_glmmodel_nb2)#no offset
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
  Maternal Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 18.26 16.17 – 20.63 <0.001
Ethnicity [Asian,
Non-Hispanic]
1.68 1.48 – 1.90 <0.001
Ethnicity [Black,
Non-Hispanic]
14.55 12.96 – 16.35 <0.001
Ethnicity [Hispanic] 8.07 7.18 – 9.07 <0.001
Ethnicity [White,
Non-Hispanic]
18.60 16.56 – 20.89 <0.001
Random Effects
σ2
τ00 Year:Month0.03
ICC 0.00
N Year 7
Observations 539
Marginal R2 / Conditional R2 0.055 / 0.058
Code <<<<<<< HEAD
resid_pearson <- residuals(ethnicity_agegroup_glmmodel_nb, type = "pearson")
plot(resid_pearson ~ fitted(ethnicity_agegroup_glmmodel_nb), main = "Pearson residuals vs Fitted")
abline(h = 0, col = "red")

Code
overdisp_fun <- function(ethnicity_agegroup_glmmodel_nb) {
  rdf <- df.residual(ethnicity_agegroup_glmmodel_nb)
  rp <- residuals(ethnicity_agegroup_glmmodel_nb, type = "pearson")
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq / rdf
  pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
  c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)
}

overdisp_fun(ethnicity_agegroup_glmmodel_nb)
      chisq       ratio         rdf           p 
535.0503616   1.0114374 529.0000000   0.4184795 
Code
#ratio is around 1 which is good and not too indicative of overdispersion
#pp-value is greater that P<0.005 so that means there is no significant overdispersion
Code
arm::binnedplot(fitted(ethnicity_agegroup_glmmodel_nb), 
           residuals(ethnicity_agegroup_glmmodel_nb, type = "response"), 
           main = "Binned residual plot")

Code
# fan shape  heteroskedasticity (non-constant variance)- coon when with count data
Code
ranef(ethnicity_agegroup_glmmodel_nb)$cond
$Year
     (Intercept)
2019 -0.13942867
2020 -0.01799831
2021  0.24604207
2022  0.31072118
2023 -0.11605061
2024 -0.12503758
2025 -0.15849481
Code
library(ggplot2)


re <- ranef(ethnicity_agegroup_glmmodel_nb)
cond_re <- re$cond
year_re <- as.data.frame(cond_re$Year)
year_re$Year <- rownames(year_re)

ggplot(year_re, aes(x = factor(Year), y = `(Intercept)`)) +
  geom_point(size = 3) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Random Effect Intercepts by Year",
       x = "Year",
       y = "Random Effect Estimate (Intercept)") +
  theme_minimal()

Code
#dashed zero is the overall random intercept
# points above zero - years with above average random effect
#points below zero- years below average random effect
Code
performance::check_collinearity(ethnicity_agegroup_glmmodel_nb)
# Check for Multicollinearity

Low Correlation

      Term  VIF   VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
 Ethnicity 2.46 [2.17, 2.82]     1.57      0.41     [0.35, 0.46]
 Age_Group 2.46 [2.17, 2.82]     1.57      0.41     [0.35, 0.46]
Code
performance::check_model(ethnicity_agegroup_glmmodel_nb, verbose = TRUE, residual_type = "normal")

Code
sjPlot::plot_model(ethnicity_agegroup_glmmodel_nb, type = "re") #no extreme points with random effects

Code
sjPlot::tab_model(ethnicity_agegroup_glmmodel_nb)
=======
sjPlot::tab_model(both_glmmodel_nb)
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
<<<<<<< HEAD =======
  Maternal_Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 0.00 0.00 – 0.00 <0.001
Subgroup [40 years and
over]
4.16 4.05 – 4.27 <0.001
Subgroup [American Indian
or Alaska Native,
Non-Hispanic]
3.24 2.91 – 3.61 <0.001
Subgroup [Asian,
Non-Hispanic]
0.65 0.62 – 0.68 <0.001
Subgroup [Black,
Non-Hispanic]
2.41 2.36 – 2.47 <0.001
Subgroup [Hispanic] 0.77 0.75 – 0.79 <0.001
Subgroup [Under 25 years] 0.68 0.66 – 0.70 <0.001
Subgroup [White,
Non-Hispanic]
0.86 0.84 – 0.88 <0.001
Random Effects
σ2
τ00 Year:Month 0.04
τ00 Month 0.00
N Year 6
N Month 12
Observations 518
Code
sjPlot::tab_model(both_glmmodel_nb2)#no offset
  Maternal Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 575.57 549.92 – 602.42 <0.001
Subgroup [40 years and
over]
0.21 0.21 – 0.22 <0.001
Subgroup [American Indian
or Alaska Native,
Non-Hispanic]
0.03 0.03 – 0.03 <0.001
Subgroup [Asian,
Non-Hispanic]
0.05 0.05 – 0.06 <0.001
Subgroup [Black,
Non-Hispanic]
0.46 0.45 – 0.47 <0.001
Subgroup [Hispanic] 0.26 0.25 – 0.26 <0.001
Subgroup [Under 25 years] 0.20 0.20 – 0.21 <0.001
Subgroup [White,
Non-Hispanic]
0.59 0.58 – 0.60 <0.001
Random Effects
σ2
τ00 Year:Month 0.04
τ00 Month 0.00
N Year 6
N Month 12
Observations 518
Code
sjPlot::tab_model(glmmodel_nb3)#no offset
>>>>>>> 5ff3583d31d40c93aaa9005a06c34bf67126bfd1
  Maternal Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 18.26 16.17 – 20.63 <0.001
Ethnicity [Asian,
Non-Hispanic]
1.68 1.48 – 1.90 <0.001
Ethnicity [Black,
Non-Hispanic]
3.66 3.48 – 3.86 <0.001
Ethnicity [White,
Non-Hispanic]
1.30 1.24 – 1.37 <0.001
Ethnicity [Hispanic] 1.14 1.08 – 1.21 <0.001
Ethnicity [American
Indian or Alaska Native,
Non-Hispanic]
5.28 4.66 – 5.99 <0.001
Ethnicity [Unknown] 1.03 0.97 – 1.08 0.363
Age_Group25-39 years 1.47 1.42 – 1.53 <0.001
Age Group [40 years and
over]
6.04 5.79 – 6.30 <0.001
Random Effects
σ2 8.03
τ00 Year 0.03
ICC 0.00
N Year 7
Observations 539
Marginal R2 / Conditional R2 0.053 / 0.057
Code
resid_pearson <- residuals(allno_glmmodel_nb, type = "pearson")
plot(resid_pearson ~ fitted(allno_glmmodel_nb), main = "Pearson residuals vs Fitted")
abline(h = 0, col = "red")

Code
overdisp_fun <- function(allno_glmmodel_nb) {
  rdf <- df.residual(allno_glmmodel_nb)
  rp <- residuals(allno_glmmodel_nb, type = "pearson")
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq / rdf
  pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
  c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)
}

overdisp_fun(allno_glmmodel_nb)
      chisq       ratio         rdf           p 
558.2070602   1.0572103 528.0000000   0.1754561 
Code
#ratio is around 1 which is good and not too indicative of overdispersion
#pp-value is greater that P<0.005 so that means there is no significant overdispersion
Code
arm::binnedplot(fitted(allno_glmmodel_nb), 
           residuals(allno_glmmodel_nb, type = "response"), 
           main = "Binned residual plot")

Code
# fan shape  heteroskedasticity (non-constant variance)- coon when with count data
Code
ranef(allno_glmmodel_nb)$cond
$Year
      (Intercept)
2019 -0.223376485
2020 -0.115595028
2021  0.116746964
2022  0.310869243
2023 -0.001446765
2024 -0.026653810
2025 -0.060626691
Code
library(ggplot2)


re <- ranef(allno_glmmodel_nb)
cond_re <- re$cond
year_re <- as.data.frame(cond_re$Year)
year_re$Year <- rownames(year_re)

ggplot(year_re, aes(x = factor(Year), y = `(Intercept)`)) +
  geom_point(size = 3) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Random Effect Intercepts by Year",
       x = "Year",
       y = "Random Effect Estimate (Intercept)") +
  theme_minimal()

Code
#dashed zero is the overall random intercept
# points above zero - years with above average random effect
#points below zero- years below average random effect
Code
performance::check_collinearity(allno_glmmodel_nb)
# Check for Multicollinearity

Low Correlation

      Term  VIF       VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
 Ethnicity 2.60 [2.29,     2.99]     1.61      0.38     [0.33, 0.44]
 Age_Group 2.59 [2.28,     2.98]     1.61      0.39     [0.34, 0.44]
 Dobbs_Era 1.00 [1.00, 2.53e+07]     1.00      1.00     [0.00, 1.00]
Code
performance::check_model(allno_glmmodel_nb, verbose = TRUE, residual_type = "normal")

Code
sjPlot::tab_model(allno_glmmodel_nb)#no offset
  Maternal_Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 33.87 29.78 – 38.51 <0.001
Ethnicity [Black,
Non-Hispanic]
8.49 8.08 – 8.93 <0.001
Ethnicity [White,
Non-Hispanic]
10.87 10.35 – 11.42 <0.001
Ethnicity [Hispanic] 4.69 4.46 – 4.94 <0.001
Ethnicity [American
Indian or Alaska Native,
Non-Hispanic]
0.60 0.53 – 0.68 <0.001
Ethnicity [Unknown] 3.72 3.53 – 3.92 <0.001
Age_Group25-39 years 4.96 4.80 – 5.13 <0.001
Age Group [40 years and
over]
1.04 1.00 – 1.08 0.080
Dobbs Era [Post-Dobbs] 0.80 0.77 – 0.84 <0.001
Random Effects
σ2 0.01
τ00 Year 0.03
ICC 0.71
N Year 7
Observations 539
Marginal R2 / Conditional R2 0.960 / 0.988
Code
resid_pearson <- residuals(ethnicity_agegroupno_glmmodel_nb, type = "pearson")
plot(resid_pearson ~ fitted(ethnicity_agegroupno_glmmodel_nb), main = "Pearson residuals vs Fitted")
abline(h = 0, col = "red")

Code
overdisp_fun <- function(ethnicity_agegroupno_glmmodel_nb) {
  rdf <- df.residual(ethnicity_agegroupno_glmmodel_nb)
  rp <- residuals(ethnicity_agegroupno_glmmodel_nb, type = "pearson")
  Pearson.chisq <- sum(rp^2)
  prat <- Pearson.chisq / rdf
  pval <- pchisq(Pearson.chisq, df = rdf, lower.tail = FALSE)
  c(chisq = Pearson.chisq, ratio = prat, rdf = rdf, p = pval)
}

overdisp_fun(ethnicity_agegroupno_glmmodel_nb)
      chisq       ratio         rdf           p 
536.1511422   1.0135182 529.0000000   0.4054191 
Code
#ratio is around 1 which is good and not too indicative of overdispersion
#pp-value is greater that P<0.005 so that means there is no significant overdispersion
Code
arm::binnedplot(fitted(ethnicity_agegroupno_glmmodel_nb), 
           residuals(ethnicity_agegroupno_glmmodel_nb, type = "response"), 
           main = "Binned residual plot")

Code
# fan shape  heteroskedasticity (non-constant variance)- coon when with count data
Code
ranef(ethnicity_agegroupno_glmmodel_nb)$cond
$Year
      (Intercept)
2019 -0.113568696
2020 -0.007917401
2021  0.225172204
2022  0.316060220
2023 -0.116572575
2024 -0.134986052
2025 -0.168400217
Code
library(ggplot2)


re <- ranef(ethnicity_agegroupno_glmmodel_nb)
cond_re <- re$cond
year_re <- as.data.frame(cond_re$Year)
year_re$Year <- rownames(year_re)

ggplot(year_re, aes(x = factor(Year), y = `(Intercept)`)) +
  geom_point(size = 3) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Random Effect Intercepts by Year",
       x = "Year",
       y = "Random Effect Estimate (Intercept)") +
  theme_minimal()

Code
#dashed zero is the overall random intercept
# points above zero - years with above average random effect
#points below zero- years below average random effect
Code
performance::check_collinearity(ethnicity_agegroupno_glmmodel_nb)
# Check for Multicollinearity

Low Correlation

      Term  VIF   VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
 Ethnicity 2.48 [2.18, 2.84]     1.57      0.40     [0.35, 0.46]
 Age_Group 2.48 [2.18, 2.84]     1.57      0.40     [0.35, 0.46]
Code
performance::check_model(ethnicity_agegroupno_glmmodel_nb, verbose = TRUE, residual_type = "normal")

Code
sjPlot::tab_model(ethnicity_agegroupno_glmmodel_nb)
  Maternal_Deaths
Predictors Incidence Rate Ratios CI p
(Intercept) 30.37 26.39 – 34.96 <0.001
Ethnicity [Black,
Non-Hispanic]
8.49 8.06 – 8.94 <0.001
Ethnicity [White,
Non-Hispanic]
10.86 10.31 – 11.44 <0.001
Ethnicity [Hispanic] 4.68 4.44 – 4.94 <0.001
Ethnicity [American
Indian or Alaska Native,
Non-Hispanic]
0.62 0.55 – 0.70 <0.001
Ethnicity [Unknown] 3.72 3.52 – 3.93 <0.001
Age_Group25-39 years 4.96 4.78 – 5.15 <0.001
Age Group [40 years and
over]
1.03 0.99 – 1.08 0.125
Random Effects
σ2 0.01
τ00 Year 0.03
ICC 0.71
N Year 7
Observations 539
Marginal R2 / Conditional R2 0.950 / 0.986

Modeling and Results

  • Explain your data preprocessing and cleaning steps.

  • Present your key findings in a clear and concise manner.

  • Use visuals to support your claims.

  • Tell a story about what the data reveals.

Conclusion

  • Summarize your key findings.

  • Discuss the implications of your results.

References

Agresti, A. 2015. Foundations of Linear and Generalized Linear Models. Wiley Series in Probability and Statistics. Wiley. https://books.google.com/books?id=jlIqBgAAQBAJ.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
Candy, Steven G. 2000. “The Application of Generalized Linear Mixed Models to Multi-Level Sampling for Insect Population Monitoring.” Environmental and Ecological Statistics 7 (3): 217–38.
Lee, Youngjo, and John A Nelder. 1996. “Hierarchical Generalized Linear Models.” Journal of the Royal Statistical Society Series B: Statistical Methodology 58 (4): 619–56.
McGillycuddy, Maeve, David I. Warton, Gordana Popovic, and Benjamin M. Bolker. 2025. “Parsimoniously Fitting Large Multivariate Random Effects in glmmTMB.” Journal of Statistical Software 112 (1): 1–19. https://doi.org/10.18637/jss.v112.i01.
Owili, Patrick Opiyo, Tang-Huang Lin, Miriam Adoyo Muga, and Wei-Hung Lien. 2020. “Impacts of Discriminated PM2. 5 on Global Under-Five and Maternal Mortality.” Scientific Reports 10 (1): 17654.
R Core Team. 2025. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Salinas Ruı́z, Josafhat, Osval Antonio Montesinos López, Gabriela Hernández Ramı́rez, and Jose Crossa Hiriart. 2023. Generalized Linear Mixed Models with Applications in Agriculture and Biology. Springer Nature.
Tawiah, Kassim, Samuel Iddi, and Anani Lotsi. 2020. “On Zero-Inflated Hierarchical Poisson Models with Application to Maternal Mortality Data.” International Journal of Mathematics and Mathematical Sciences 2020 (1): 1407320.
Thall, Peter F. 1988. “Mixed Poisson Likelihood Regression Models for Longitudinal Interval Count Data.” Biometrics, 197–209.
Thompson, Jennifer A, Clemence Leyrat, Katherine L Fielding, and Richard J Hayes. 2022. “Cluster Randomised Trials with a Binary Outcome and a Small Number of Clusters: Comparison of Individual and Cluster Level Analysis Method.” BMC Medical Research Methodology 22 (1): 222.
Wang, Ke-Sheng, Xuefeng Liu, Muyiwa Ategbole, Xin Xie, Ying Liu, Chun Xu, Changchun Xie, and Zhanxin Sha. 2017. “Generalized Linear Mixed Model Analysis of Urban-Rural Differences in Social and Behavioral Factors for Colorectal Cancer Screening.” Asian Pacific Journal of Cancer Prevention: APJCP 18 (9): 2581.
Zuur, Alain F, Elena N Ieno, Neil J Walker, Anatoly A Saveliev, Graham M Smith, et al. 2009. Mixed Effects Models and Extensions in Ecology with r. Vol. 574. Springer.